updating loading in stable lm demo to use transformer bridge #1023

Merged
jlarson4 merged 36 commits into dev-3.x-canary from stable_lm_demo_transformer_bridge_migration
Apr 3, 2026
Conversation

@degenfabian
Collaborator

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@bryce13950 bryce13950 added this to the 3.0 milestone Aug 20, 2025
@bryce13950 bryce13950 changed the base branch from dev-3.x to dev-3.x-folding October 10, 2025 13:06
@jlarson4 jlarson4 changed the base branch from dev-3.x-folding to dev-3.x February 9, 2026 16:25
@jlarson4 jlarson4 self-assigned this Apr 3, 2026
@jlarson4 jlarson4 changed the base branch from dev-3.x to dev-3.x-canary April 3, 2026 17:29
@jlarson4
Collaborator

jlarson4 commented Apr 3, 2026

I ended up having to rewrite a significant portion of this notebook. I was unable to recreate the saved data on this version or on 2.x's Hooked Transformer; I also attempted to recreate the output on main and on dev. As far as I am aware, this notebook became inaccurate long ago due to a bug in abstract attention.

In July 2023, when this demo was created in #354, components.py had the IGNORE buffer set to -1e5 (the bug). That value is what produced the results demonstrated in the original version of the notebook.

In September 2023, IGNORE was changed to -torch.inf, and the NaNs caused by applying softmax to a vector of -infs were zeroed out. These changes landed across PR #366 and PR #386.

The current behavior appears to be the correct behavior, and the original details of this demo were a side effect of the -1e5 buffer bug.
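
For anyone who wants to see how the two IGNORE values can diverge, here is a minimal plain-Python sketch. It is not the code from components.py: `softmax` and `masked_attention` are hypothetical helpers written only for illustration, and the scores are made up. It assumes masked positions are overwritten with the IGNORE value before softmax and that any resulting NaNs are zeroed afterwards, as described above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    # If every score is -inf, then s - m is (-inf) - (-inf) = NaN.
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(scores, mask, ignore):
    """Overwrite masked-out scores with `ignore`, softmax, then zero NaNs."""
    filled = [s if keep else ignore for s, keep in zip(scores, mask)]
    probs = softmax(filled)
    return [0.0 if math.isnan(p) else p for p in probs]


# Ordinary scores: both IGNORE values give the masked position zero weight.
typical = ([1.0, 2.0, 3.0], [True, True, False])
same_a = masked_attention(*typical, ignore=-1e5)
same_b = masked_attention(*typical, ignore=-math.inf)

# Fully masked row: -1e5 yields uniform attention, -inf yields all zeros.
fully_masked = ([0.3, 0.7], [False, False])
uniform = masked_attention(*fully_masked, ignore=-1e5)
zeros = masked_attention(*fully_masked, ignore=-math.inf)

# A real score below -1e5: the finite IGNORE value lets the masked
# position win, while -inf still excludes it.
extreme = ([-2e5, 0.0], [True, False])
leaked = masked_attention(*extreme, ignore=-1e5)
clean = masked_attention(*extreme, ignore=-math.inf)
```

In this sketch the two settings agree on ordinary scores and differ only in the edge cases (a fully masked row, or real scores more negative than the finite IGNORE value). Whether either case is what affected the saved notebook outputs is not something this example establishes.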

@jlarson4 jlarson4 merged commit b00a517 into dev-3.x-canary Apr 3, 2026
19 checks passed
@jlarson4
Collaborator

jlarson4 commented Apr 3, 2026

Additionally, I left this commented out of CI for now. The StableLM Alphas are too large to run within the constraints of our current CI infrastructure.

3 participants